Detecting differential rater functioning over time (DRIFT) using a Rasch multi-faceted rating scale model.
Abstract
This paper describes a class of rater effects that reflect rater-by-time interactions. We refer to this class as differential rater functioning over time (DRIFT). The article describes several types of DRIFT (primacy/recency, differential centrality/extremism, and practice/fatigue) and Rasch measurement procedures designed to identify them in rating data. Applied to simulated data, these procedures prove useful for classifying raters as aberrant or non-aberrant for primacy, recency, and differential centrality and extremism, particularly for moderate or larger effect sizes. Rates of correct classification for practice and fatigue were lower, and statistical power exceeded .50 only with very large effect sizes. Type I error rates (i.e., incorrect nomination) were near expected levels in all cases.
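The abstract does not state the model equation. As a rough orientation only, a many-facet rating scale model extended with a rater-by-time term is commonly written along the following lines. The symbols here are assumptions chosen for illustration and are not taken from the paper: \theta_n is the proficiency of examinee n, \delta_i the difficulty of item i, \lambda_j the severity of rater j, \tau_k the threshold between categories k-1 and k, and \gamma_{jt} a hypothetical interaction between rater j and time period t.

```latex
% Sketch of a many-facet rating scale model with a rater-by-time
% interaction term (notation assumed, not taken from the paper).
\[
  \ln\!\left(\frac{P_{nijtk}}{P_{nijt(k-1)}}\right)
    = \theta_n - \delta_i - \lambda_j - \gamma_{jt} - \tau_k
\]
% P_{nijtk}: probability that rater j, at time t, assigns category k
% to examinee n on item i. Under this sketch, a rater whose severity
% is stable over time has \gamma_{jt} = 0 for every t.
```

Under this sketch, a change in a rater's severity across time periods would appear as nonzero \gamma_{jt} estimates. How the paper operationalizes each type of DRIFT is not recoverable from the abstract alone.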
Related articles
A Study of Raters’ Behavior in Scoring L2 Speaking Performance: Using Rater Discussion as a Training Tool
The studies conducted so far on the effectiveness of resolution methods including the discussion method in resolving discrepancies in rating have yielded mixed results. What is left unnoticed in the literature is the potential of discussion to be used as a training tool rather than a resolution method. The present study addresses this research gap by exploring the data coming from rating behavi...
Detecting and measuring rater effects using many-facet Rasch measurement: part I.
The purpose of this two-part paper is to introduce researchers to the many-facet Rasch measurement (MFRM) approach for detecting and measuring rater effects. The researcher will learn how to use the Facets (Linacre, 2001) computer program to study five effects: leniency/severity, central tendency, randomness, halo, and differential leniency/severity. Part 1 of the paper provides critical backgr...
Rater Errors among Peer-Assessors: Applying the Many-Facet Rasch Measurement Model
In this study, the researcher used the many-facet Rasch measurement model (MFRM) to detect two pervasive rater errors among peer-assessors rating EFL essays. The researcher also compared the ratings of peer-assessors to those of teacher assessors to gain a clearer understanding of the ratings of peer-assessors. To that end, the researcher used a fully crossed design in which all peer-assessors ...
Diagnostic Writing Assessment: the Development and Validation of a Rating Scale
Alderson (2005) suggests that diagnostic tests should identify strengths and weaknesses in learners' use of language, focus on specific elements rather than global abilities and provide detailed feedback to stakeholders. However, rating scales used in performance assessment have been repeatedly criticized for being imprecise, for using impressionistic terminology (Fulcher, 2003; Upshur & Turner...
Many-Facet Rasch Measurement
This chapter provides an introductory overview of many-facet Rasch measurement (MFRM). Broadly speaking, MFRM refers to a class of measurement models that extend the basic Rasch model by incorporating more variables (or facets) than the two that are typically included in a test (i.e., examinees and items), such as raters, scoring criteria, and tasks. Throughout the chapter, a sample of rating d...
Journal title:
- Journal of Applied Measurement
Volume 2, Issue 3
Pages: -
Publication date: 2001